Fixes #26670: Resolve QuickSight column aliases to correct upstream column in lineage #27301
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
Pull request overview
Resolves incorrect QuickSight CustomSql column-level lineage when projected columns use aliases (e.g., SELECT id AS relation_id) by leveraging LineageParser.column_lineage mappings instead of falling back to name-based matching.
Changes:
- Added `_build_column_lineage_from_parser()` to build `ColumnLineage` entries from SQL parser column mappings with parent-table filtering and fallback behavior.
- Updated `_yield_lineage_from_query()` to use the new parser-based column lineage builder.
- Added unit tests covering alias resolution, multi-table filtering, and fallback behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| ingestion/src/metadata/ingestion/source/dashboard/quicksight/metadata.py | Builds column-level lineage using SQL parser alias mappings and uses it during lineage emission. |
| ingestion/tests/unit/topology/dashboard/test_quicksight.py | Adds unit tests to validate alias resolution, multi-table filtering, and fallback behavior. |
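The behavior summarized above (alias mappings from the parser, a parent-table filter, and a name-based fallback) can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ParsedColumn`, `ColumnEdge`, and `build_edges` are hypothetical stand-ins for the parser's column objects, `ColumnLineage`, and `_build_column_lineage_from_parser()`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ParsedColumn:
    """Hypothetical stand-in for a parser source column and its parent table."""
    name: str
    table: str


@dataclass(frozen=True)
class ColumnEdge:
    """Hypothetical stand-in for one column-level lineage entry."""
    from_column: str
    to_column: str


def build_edges(
    pairs: List[Tuple[ParsedColumn, str]],
    upstream_table: str,
    dataset_columns: List[str],
    table_columns: List[str],
) -> List[ColumnEdge]:
    """Build edges for one upstream table from (source column, target name) pairs."""
    if not pairs:
        # Fallback: the parser produced nothing, so match columns by name.
        return [ColumnEdge(c, c) for c in dataset_columns if c in table_columns]

    edges = []
    for source, target in pairs:
        # Parent-table filter: skip sources that belong to a different table.
        if source.table != upstream_table:
            continue
        if source.name in table_columns and target in dataset_columns:
            edges.append(ColumnEdge(source.name, target))
    return edges


# SELECT o.id AS relation_id, c.id AS customer_id FROM orders o JOIN customers c ...
pairs = [
    (ParsedColumn("id", "orders"), "relation_id"),
    (ParsedColumn("id", "customers"), "customer_id"),
]
edges = build_edges(pairs, "orders", ["relation_id", "customer_id"], ["id", "total"])
fallback = build_edges([], "orders", ["id", "relation_id"], ["id", "total"])
```

With `orders` as the upstream table, only the `id` to `relation_id` edge survives; the `customers.id` mapping is filtered out even though the column name matches.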
All bot review comments from Copilot and Gitar Bot have been addressed in commit 106c664:
@harshach @pmbrull could you please add the |
@pytest.mark.order(12)
def test_build_column_lineage_from_parser_iterable_parent(self):
    """
    When src_col._parent is an iterable of parent tables (as some
    parser outputs produce), _build_column_lineage_from_parser must
The new tests use @pytest.mark.order(12) and then later @pytest.mark.order(11), which makes the ordered test sequence harder to follow and can be confusing when diagnosing failures. Consider keeping the order markers monotonically increasing in the file (swap these two order values, or reorder the tests to match).
Done. Corrected the @pytest.mark.order tags to maintain a monotonically increasing sequence.
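One way to check the ordering concern raised in this review thread is to scan a test file for `@pytest.mark.order(N)` markers and confirm the values increase in file order. The sketch below uses only the standard library; the helper names and the sample test names (`test_a`, `test_b`, `test_c`) are hypothetical and not part of the PR.

```python
import re
from typing import List

# Matches markers such as @pytest.mark.order(12) and captures the number.
ORDER_MARKER = re.compile(r"@pytest\.mark\.order\((\d+)\)")


def order_values(source: str) -> List[int]:
    """Return the order values in the sequence they appear in the file."""
    return [int(value) for value in ORDER_MARKER.findall(source)]


def is_monotonic(values: List[int]) -> bool:
    """True when every marker is strictly greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


before = """
@pytest.mark.order(10)
def test_a(): ...
@pytest.mark.order(12)
def test_b(): ...
@pytest.mark.order(11)
def test_c(): ...
"""

after = """
@pytest.mark.order(10)
def test_a(): ...
@pytest.mark.order(11)
def test_b(): ...
@pytest.mark.order(12)
def test_c(): ...
"""

print(is_monotonic(order_values(before)), is_monotonic(order_values(after)))
```

The `before` text mirrors the situation the reviewer flagged (12 followed by 11); the `after` text mirrors the corrected sequence.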
@mohitjeswani01 could you take a look at Copilot's last 2 comments?
Yes @PubChimps, I will definitely fix those and also check the CI failures!
b501e67 to edfdf80
@PubChimps I have addressed the Copilot bot findings. The fallback logic in metadata.py has been updated to prevent manufacturing incorrect cross-table lineage, and the corresponding regression test is added. The pytest order markers in test_quicksight.py are also corrected to a monotonic sequence. The branch is rebased and ready. Could you please add the |
… also fix bot comments
…orrect pytest ordering
…iedEntityName wrapping
7735d55 to 9ae5cbc
Code Review ✅ Approved 2 resolved / 2 findings
QuickSight column aliases are now correctly resolved to their upstream lineage, addressing issues with parent-table filters and incomplete test coverage. No further issues found.
✅ 2 resolved
✅ Edge Case: No parent-table filter when _parent is None in multi-table SQL
✅ Bug: Test doesn't exercise parent-table name comparison it claims to
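The first finding concerns source columns whose parent table is unknown in a query that reads several tables. One conservative policy, shown here only as an assumption about how such a guard could look, is to attribute an unknown-parent column to a table only when the query reads exactly one table. The function name and parameters are hypothetical and the actual fix in `metadata.py` may differ.

```python
from typing import Optional, Sequence


def source_belongs_to_table(
    parent_table: Optional[str],
    upstream_table: str,
    tables_in_query: Sequence[str],
) -> bool:
    """Decide whether a parsed source column may be linked to upstream_table."""
    if parent_table is not None:
        # Known parent: compare table names case-insensitively.
        return parent_table.lower() == upstream_table.lower()
    # Unknown parent: only safe when there is no other table it could come from.
    return len(tables_in_query) == 1


multi = ["orders", "customers"]
print(source_belongs_to_table(None, "orders", multi))       # unknown parent, two tables
print(source_belongs_to_table(None, "orders", ["orders"]))  # unknown parent, one table
```

Skipping the ambiguous case trades some missing lineage for not producing cross-table edges that the SQL never expressed.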
@PubChimps , @harshach sir i have solved the merge conflicts also resolved bots comments and ran the tests locally could you please add a

Fixes #26670
Description
I worked on this because QuickSight datasets using CustomSql with column aliases (e.g. `SELECT id AS relation_id`) were producing incorrect column-level lineage. OpenMetadata was matching by column name instead of tracing the alias back through the SQL expression, so `relation_id` would get linked to any other upstream column named `relation_id` in the catalog, rather than the actual source column `id`.
The fix was simpler than it looks: `LineageParser` was already being instantiated with the CustomSql query in `_yield_lineage_from_query()`, but its `column_lineage` property was never used for column-level lineage. The code was falling back to name-based matching instead.
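The difference between the two strategies can be shown with a toy example. The column list and the alias mapping below are hypothetical; in the PR the mapping would come from the parser's `column_lineage` output rather than a hand-written dict.

```python
from typing import Dict, List, Optional

# Hypothetical upstream table that happens to contain both columns.
upstream_columns = ["id", "relation_id", "name"]

# Alias mapping for: SELECT id AS relation_id FROM some_table
alias_to_source: Dict[str, str] = {"relation_id": "id"}


def match_by_name(dataset_column: str, columns: List[str]) -> Optional[str]:
    """Old behavior: link to whichever upstream column shares the name."""
    return dataset_column if dataset_column in columns else None


def match_by_alias(
    dataset_column: str, aliases: Dict[str, str], columns: List[str]
) -> Optional[str]:
    """New behavior: resolve the alias first, then look up the source column."""
    source = aliases.get(dataset_column, dataset_column)
    return source if source in columns else None


by_name = match_by_name("relation_id", upstream_columns)
by_alias = match_by_alias("relation_id", alias_to_source, upstream_columns)
print(by_name, by_alias)
```

Name-based matching picks the unrelated `relation_id` column, while alias resolution traces the dataset column back to `id`.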
What I did:
- `_build_column_lineage_from_parser()` method that uses `lineage_parser.column_lineage` to resolve alias mappings correctly
- `src_col._parent` to filter by parent table; prevents wrong lineage when multiple upstream tables share the same column name
- `raw_name` via `.split(".")[-1]` to handle fully-qualified column names from the parser
- (`len(col_pair) < 2`)
- no column lineage (SQL too complex or parsing failed)
- `get_column_fqn` import which was only in the base class before
Type of change:
Checklist:
- I have read the CONTRIBUTING document.
- My PR title is `Fixes #26670: Resolve QuickSight column aliases to correct upstream column in lineage`
- I have commented on my code, particularly in hard-to-understand areas.
- For JSON Schema changes: I updated the migration scripts or explained why it is not needed.
- I have added a test that covers the exact scenario we are fixing.
Tests added (`test_quicksight.py`):
- `test_build_column_lineage_from_parser_resolves_alias`: verifies alias resolution
- `test_build_column_lineage_from_parser_multi_table_filters_correctly`: verifies parent table filtering prevents wrong lineage in multi-table joins
- `test_build_column_lineage_from_parser_falls_back_when_empty`: verifies graceful fallback to name-based matching
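The `.split(".")[-1]` normalization and the `len(col_pair) < 2` check mentioned under "What I did" can be sketched together. This assumes each parser entry is a sequence whose first element is the source column and whose last element is the target column, represented here as plain strings; `raw_name` and `clean_pairs` are hypothetical helpers, not functions from the PR.

```python
from typing import Iterable, List, Sequence, Tuple


def raw_name(qualified: str) -> str:
    """Strip any database/schema/table prefix from a column name."""
    return qualified.split(".")[-1]


def clean_pairs(column_pairs: Iterable[Sequence[str]]) -> List[Tuple[str, str]]:
    """Keep well-formed (source, target) entries with unqualified names."""
    cleaned = []
    for col_pair in column_pairs:
        # A usable entry needs at least a source and a target.
        if len(col_pair) < 2:
            continue
        cleaned.append((raw_name(col_pair[0]), raw_name(col_pair[-1])))
    return cleaned


parsed = [
    ("db.schema.orders.id", "relation_id"),  # fully-qualified source
    ("orphan_column",),                      # malformed: no target
]
result = clean_pairs(parsed)
print(result)
```

Splitting on `"."` is a simplification: a quoted identifier that itself contains a dot would be truncated, so real code may need a stricter parser-level accessor.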